

Investigating Scale Independent UCT Exploration Factor Strategies

Schmöcker, Robin, Schnell, Christoph, Dockhorn, Alexander

arXiv.org Artificial Intelligence

The Upper Confidence Bounds for Trees (UCT) algorithm is not agnostic to the reward scale of the game it is applied to. For zero-sum games with sparse terminal rewards in $\{-1,0,1\}$, this is not a problem, but many games feature dense rewards with hand-picked reward scales, causing a node's Q-value to span different magnitudes across different games. In this paper, we evaluate various strategies for adaptively choosing the UCT exploration constant $\lambda$, called $\lambda$-strategies, that are agnostic to the game's reward scale. These include strategies proposed in the literature as well as five new ones. Based on our experimental results, we recommend one of our newly proposed $\lambda$-strategies: choosing $\lambda = 2 \cdot \sigma$, where $\sigma$ is the empirical standard deviation of the Q-values of all state-action pairs in the search tree. This method outperforms existing $\lambda$-strategies across a wide range of tasks, both when using a single parameter value and in the peak performance obtained by optimizing all available parameters.
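As a concrete sketch of the recommended strategy, the selection step with the scale-independent $\lambda = 2 \cdot \sigma$ can look as follows (function and field names are illustrative, not taken from the paper):

```python
import math

def adaptive_lambda(q_values, c=2.0):
    # lambda = 2 * sigma, where sigma is the empirical standard deviation
    # of the Q-values of all state-action pairs currently in the tree
    m = len(q_values)
    mean = sum(q_values) / m
    var = sum((q - mean) ** 2 for q in q_values) / m
    return c * math.sqrt(var)

def ucb_score(q, n_child, n_parent, lam):
    # standard UCT value: exploitation term plus a scale-aware bonus
    return q + lam * math.sqrt(math.log(n_parent) / n_child)

def select_child(children, n_parent, tree_q_values):
    # pick the child maximizing UCT with the adaptively chosen lambda
    lam = adaptive_lambda(tree_q_values)
    return max(children, key=lambda ch: ucb_score(ch["q"], ch["n"], n_parent, lam))
```

Because $\lambda$ is tied to the spread of observed Q-values rather than a fixed constant, the exploration bonus keeps the same relative weight whether rewards live in $[-1,1]$ or in the thousands.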


Evaluating Sparse Autoencoders for Monosemantic Representation

Fereidouni, Moghis, Haider, Muhammad Umair, Ju, Peizhong, Siddique, A. B.

arXiv.org Artificial Intelligence

A key barrier to interpreting large language models is polysemanticity, where neurons activate for multiple unrelated concepts. Sparse autoencoders (SAEs) have been proposed to mitigate this issue by transforming dense activations into sparse, more interpretable features. While prior work suggests that SAEs promote monosemanticity, no quantitative comparison has examined how concept activation distributions differ between SAEs and their base models. This paper provides the first systematic evaluation of SAEs against base models through the lens of activation distributions. We introduce a fine-grained concept separability score based on the Jensen-Shannon distance, which captures how distinctly a neuron's activation distributions vary across concepts. Using two large language models (Gemma-2-2B and DeepSeek-R1) and multiple SAE variants across five datasets (including word-level and sentence-level), we show that SAEs reduce polysemanticity and achieve higher concept separability. To assess practical utility, we evaluate concept-level interventions using two strategies: full neuron masking and partial suppression. We find that, compared to base models, SAEs enable more precise concept-level control when using partial suppression. Building on this, we propose Attenuation via Posterior Probabilities (APP), a new intervention method that uses concept-conditioned activation distributions for targeted suppression. APP achieves the smallest perplexity increase while remaining highly effective at concept removal.
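The abstract does not give the exact form of the separability score, but a plausible minimal sketch is the mean pairwise Jensen-Shannon distance between a neuron's per-concept activation histograms (all names below are assumptions for illustration):

```python
import math

def kl(p, q):
    # KL divergence in bits between two discrete distributions
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_distance(p, q):
    # Jensen-Shannon distance: square root of the JS divergence;
    # with base-2 logs it lies in [0, 1]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return math.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))

def separability(concept_dists):
    # mean pairwise JS distance between one neuron's activation
    # distributions across concepts; higher = more distinct = more
    # monosemantic under this (assumed) definition
    k = len(concept_dists)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    total = sum(js_distance(concept_dists[i], concept_dists[j]) for i, j in pairs)
    return total / len(pairs)
```

A neuron that fires identically for every concept scores 0; one whose distributions are disjoint across concepts scores 1.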


Predictive Multiplicity of Knowledge Graph Embeddings in Link Prediction

Zhu, Yuqicheng, Potyka, Nico, Nayyeri, Mojtaba, Xiong, Bo, He, Yunjie, Kharlamov, Evgeny, Staab, Steffen

arXiv.org Artificial Intelligence

Knowledge graph embedding (KGE) models are often used to predict missing links in knowledge graphs (KGs). However, multiple KG embeddings can perform almost equally well for link prediction yet suggest conflicting predictions for certain queries, a behavior termed \textit{predictive multiplicity} in the literature. This poses substantial risks for KGE-based applications in high-stakes domains but has been overlooked in KGE research. In this paper, we formally define predictive multiplicity in link prediction. We introduce evaluation metrics and measure predictive multiplicity for representative KGE methods on commonly used benchmark datasets. Our empirical study reveals significant predictive multiplicity in link prediction, with $8\%$ to $39\%$ of testing queries exhibiting conflicting predictions. To address this issue, we propose leveraging voting methods from social choice theory, which in our experiments mitigate conflicts by $66\%$ to $78\%$.
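The abstract does not say which voting rule the authors adopt; as one hedged illustration of aggregating conflicting link predictions, a classic Borda count over the rankings produced by several KGE models could be sketched like this (all names are hypothetical):

```python
def borda_vote(rankings):
    # rankings: one ranked list of candidate entities per KGE model,
    # best first. In an m-item ranking, the candidate in position i
    # scores m - 1 - i points; the highest total wins the query.
    scores = {}
    for ranking in rankings:
        m = len(ranking)
        for i, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (m - 1 - i)
    return max(scores, key=scores.get)
```

Even when individual models disagree on the top-ranked entity, positional rules like Borda reward candidates that are ranked consistently high across models, which is one way such aggregation can reduce conflicting predictions.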


Congestion and Scalability in Robot Swarms: a Study on Collective Decision Making

Soma, Karthik, Vardharajan, Vivek Shankar, Hamann, Heiko, Beltrame, Giovanni

arXiv.org Artificial Intelligence

One of the most important promises of decentralized systems is scalability, which is often assumed to be present in robot swarm systems without being contested. Simple limitations, such as movement congestion and communication conflicts, can drastically affect scalability. In this work, we study the effects of congestion in a binary collective decision-making task. We evaluate the impact of two types of congestion (communication and movement) when using three different techniques for the task: Honey-Bee-inspired, Stigmergy-based, and Division of Labor. We deploy up to 150 robots in a physics-based simulator performing a sampling mission in an arena with variable levels of robot density, applying the three techniques. Our results suggest that applying Division of Labor coupled with versioned local communication helps to scale the system by minimizing congestion.


Learning Image Classification with CNN using TensorFlow

#artificialintelligence

In this article we will work with an image dataset to train an image classifier using a custom CNN built with TensorFlow. Note: for those who don't already know what deep learning or a CNN is, this article may be difficult to follow, and unfortunately there is no easy way around that. This article is not meant to be a tutorial on computer vision or deep learning; for those familiar with these concepts, please read on. We will work with the dataset provided here: a nicely curated, cleaned, and arranged collection of images of roasted coffee beans, split into train and test folders.


Using Keras ImageDataGenerator with Transfer Learning

#artificialintelligence

This line of code defines the transformations that the training DataGenerator will apply to all images in order to augment the dataset. For the validation DataGenerator, we specify only the scaling factor; the other transformations are not required because we are not training the model on this data. Next, we define the model. We set layer.trainable = False for each layer of the VGG model, as we are using the pre-trained weights of the model.
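A minimal sketch of the setup described above, using Keras. The specific augmentation parameters, input size, and two-class head are assumptions for illustration, not taken from the article, and weights=None is used here only to keep the sketch offline (the article uses the pre-trained ImageNet weights, i.e. weights="imagenet"):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Training generator: augmentation transforms plus rescaling.
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    width_shift_range=0.1,
    horizontal_flip=True,
)

# Validation generator: only the scaling factor, no augmentation,
# since the model is not trained on this data.
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)

# VGG16 as a frozen feature extractor.
base = tf.keras.applications.VGG16(
    include_top=False, weights=None, input_shape=(224, 224, 3)
)
for layer in base.layers:
    layer.trainable = False  # keep the pre-trained weights fixed

# New classification head on top of the frozen base (2 classes assumed).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

Freezing the base layers means only the new head's weights are updated during training, which is what makes transfer learning cheap compared to training VGG16 from scratch.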


Deep learning for lung cancer prognostication: A retrospective multi-cohort radiomics study

#artificialintelligence

We performed an integrative analysis on 7 independent datasets across 5 institutions totaling 1,194 NSCLC patients (age median 68.3 years [range 32.5–93.3]). Using external validation in computed tomography (CT) data, we identified prognostic signatures using a 3D convolutional neural network (CNN) for patients treated with radiotherapy (n = 771, age median 68.0 years [range 32.5–93.3]). We then employed a transfer learning approach to achieve the same for surgery patients (n = 391, age median 69.1 years [range 37.2–88.0]). We found that the CNN predictions were significantly associated with 2-year overall survival from the start of the respective treatment for radiotherapy (area under the receiver operating characteristic curve [AUC] 0.70 [95% CI 0.63–0.78]). The CNN was also able to significantly stratify patients into low and high mortality-risk groups in both the radiotherapy (p < 0.001) and surgery (p = 0.03) datasets.